Protein Layers And Their Use In Electron Microscopy

ABSTRACT

Protein layers (1) repeating regularly in two dimensions comprise protein protomers (2) which each comprise at least two monomers (5), (6) genetically fused together. The monomers (5), (6) are monomers of respective oligomer assemblies (3), (4) into which the monomers are assembled on assembly of the protein layer. The first oligomer assembly (3) belongs to a dihedral point group of order O, where O equals 3, 4 or 6, and has a set of O rotational symmetry axes of order 2. The second oligomer assembly (4) has a rotational symmetry axis of order 2. Due to the symmetry of the oligomer assemblies (3), (4), the rotational symmetry axis of each second oligomer assembly (4) is aligned with one of said set of O rotational symmetry axes of a first oligomer assembly (3) with 2 protomers being arranged symmetrically therearound. Thus, a 2-fold fusion between the oligomer assemblies (3), (4) is produced and the arrangement of the rotational symmetry axes of the oligomer assemblies (3), (4) causes the protein layer to repeat regularly. The protein layer has many uses, for example to support molecular entities for biosensing, X-ray crystallography or electron microscopy.

RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 12/602,248, filed Mar. 26, 2010, which is the U.S. National Stage of International Application No. PCT/GB08/01437, filed on Apr. 23, 2008, published in English, which is a continuation-in-part of U.S. application Ser. No. 11/807,922, filed May 30, 2007, now abandoned, which is a continuation-in-part of U.S. application Ser. No. 10/530,795, filed Nov. 7, 2005, now patented as U.S. Pat. No. 7,989,591, issued on Aug. 2, 2011, which is the U.S. National Stage of International Application No. PCT/GB03/04306, filed on Oct. 8, 2003, published in English. This application claims priority under 35 U.S.C. §119 or 365 to Great Britain, Application No. 022323.7, filed Oct. 8, 2002. The entire teachings of the above applications are incorporated herein by reference.

SUMMARY

The present invention relates to protein layers which repeat regularly in two dimensions. In one aspect, the protein layers are based on symmetrical oligomer assemblies capable of self-assembly from the monomers of the oligomer assembly. The layers may have pores with dimensions of the order of nanometres to hundreds of nanometres. The protein layers are nanostructures which have many potential uses, for example as a matrix to support molecular entities for electron microscopy, or X-ray crystallography. In another aspect, the invention relates to the use of protein layers for performing electron microscopy.

WO-00/68248 discloses regular protein structures based on symmetrical oligomer assemblies capable of self-assembly. In particular, WO-00/68248 discloses structures formed from protein protomers (referred to as a “fusion protein” in WO-00/68248) comprising at least two monomers (referred to as “oligomerization domains” in WO-00/68248) which are each monomers of a respective symmetrical oligomer assembly. Self-assembly of the monomers into the oligomer assembly causes assembly of the regular structures themselves. Several different types of structures are disclosed, including discrete structures and structures extending in one, two and three dimensions.

In WO-00/68248, the relative orientations of the monomers within the protomers are selected to provide the desired regular structure upon self-assembly. The monomers are fused together through a rigid linking group which is carefully selected to provide the requisite relative orientation of the monomers in the protomer. For example, in the laboratory production reported in WO-00/68248, the selection of the protomer was performed using a computer program to model monomers connected by a linking group in the form of a continuous, intervening alpha-helical segment over a range of incrementally increased lengths. Thus, for example, the lattices suggested in WO-00/68248 having a regular structure repeating in three dimensions are formed from protomers comprising two monomers of respective dimeric or trimeric oligomer assemblies which are symmetrical about a single rotational axis. The relative orientation of the two monomers is selected to provide a specific angle of intersection between the rotational symmetry axes of the two oligomer assemblies. Thus, there is a single fusion between the two oligomer assemblies and the relative orientation of the oligomer assemblies is controlled by careful selection of the linking group providing the fusion. WO-00/68248 only reports laboratory production of protein structures of a discrete cage and a filament extending in one dimension.

It is expected that application of the teaching of WO-00/68248 to protein layers repeating in two dimensions would encounter the following difficulties. Firstly, it is expected that there would be a difficulty in design arising from the requirement to select the relative orientation of the monomers within the protomer appropriate for constructing a layer. This would probably reduce the number of types of oligomer assembly available to form a protein layer, and hence make it difficult to identify suitable proteins. Secondly, it is expected that practical difficulties would be encountered during assembly. The structures disclosed in WO-00/68248 rely on the rigidity of the fusion between monomers in protomers which forms the single fusion between oligomer assemblies. WO-00/68248 teaches that the relative orientation of the monomers in the protomers controls the relative orientation of the oligomer assemblies in the resultant structure, so it is expected that flexing of the fusion away from the desired relative orientation would reduce the reliability of self-assembly. It is expected that such a problem would become more acute as the size of the repeating unit increases, thereby providing a practical restriction on the reliable production of lattices with relatively large pore sizes.

It would be desirable to provide protein layers having a different type of structure in which these expected problems might be alleviated.

According to a first aspect of the present invention, there is provided a protein layer which repeats regularly in two dimensions,

the protein layer comprising protein protomers which each comprise at least two monomers genetically fused together, the monomers each being monomers of a respective oligomer assembly, the protomers comprising:

a first monomer which is a monomer of a first oligomer assembly belonging to a dihedral point group of order O, where O equals 3, 4 or 6, and having a set of O rotational symmetry axes of order 2 extending in two dimensions; and

a second monomer genetically fused to said first monomer, which second monomer is a monomer of a second oligomer assembly having a rotational symmetry axis of order 2,

wherein the first monomers of the protomers are assembled into said first oligomer assemblies and the second monomers of the protomers are assembled into said second oligomer assemblies, said rotational symmetry axis of order 2 of each of said second oligomer assemblies being aligned with one of said set of rotational symmetry axes of order 2 of one of said first oligomer assemblies with two protomers being arranged symmetrically therearound.

As a result of using a second oligomer assembly having a rotational symmetry axis of the same order 2 as the set of O rotational symmetry axes of said first oligomer assembly, the oligomer assemblies are fused with those symmetry axes being aligned and with 2 protomers arranged symmetrically therearound. This means that there is a 2-fold fusion between the first and second oligomer assemblies.

Furthermore, the repeating pattern of the protein layer is derived from the arrangement of the rotational symmetry axes of the first oligomer assembly and is not dependent on the relative orientation of the monomers within the protomer. As the first oligomer assembly is dihedral, the set of O symmetry axes of order 2 are coplanar. Therefore the protomers assemble into a layer having the same symmetry as the set of O symmetry axes.

Therefore, protein layers in accordance with the present invention may be designed by selecting oligomer assemblies with appropriate symmetry to build a layer repeating in two dimensions. Protomers are produced comprising monomers of the selected oligomer assemblies fused together. Subsequently, the protomers are allowed to self-assemble under suitable conditions.

To assist in understanding, reference is made to FIG. 1 which illustrates a particular example of a protein layer 1 in accordance with the present invention. FIG. 1 shows only a part of the protein layer 1 which repeats indefinitely in two dimensions. The protein layer 1 is assembled from protomers 2. The protein layer 1 comprises a first oligomer assembly 3 which in this example belongs to a dihedral point group of order 4 and so has a set of 4 rotational symmetry axes of order 2 (in addition to a single rotational symmetry axis of order 4). Each of the monomers 5 of the first oligomer assembly 3 is fused to a second monomer 6 of a second oligomer assembly 4 which in this example belongs to the dihedral point group of order 2, so having a rotational symmetry axis of order 2. As a result, the second monomers 6 are assembled into the second oligomer assemblies 4 arranged with their rotational symmetry axes of order 2 aligned along the rotational symmetry axes of order 2 of the first oligomer assembly 3, and with a 2-fold fusion between the first and second oligomer assemblies 3 and 4. Thus, the symmetry of the protein layer 1 is the same as the symmetry of the set of four rotational symmetry axes of order 2, in this case rotational symmetry of order 4.

Accordingly, the present invention involves the use of a different class of oligomer assemblies from that used in WO-00/68248. The present invention provides the benefit that one is not restricted by the need to control the relative orientation of the monomers within the protomer. Thus the design of the protein structure is assisted in that the relative orientation of the monomers within the protomer is a less critical constraint. Similarly, more reliable assembly of the protein layer is possible, as described in more detail below.

According to other aspects of the present invention, there is provided an individual protomer capable of self-assembly to form such a protein layer, as well as polynucleotides encoding the protomer, vectors and host cells capable of expressing the protomer and methods of making the protomer.

It has been appreciated that a particularly advantageous use of a protein layer which repeats regularly in two dimensions is to perform electron microscopy of a molecular entity. Thus, in accordance with a second aspect of the present invention, there is provided a method of performing electron microscopy of a molecular entity, comprising:

providing a protein layer having a structure which repeats regularly in two dimensions and which supports molecular entities each attached at a predetermined position in the repeating structure of the protein layer; and

performing electron microscopy of the protein layer having the molecular entities supported thereon to derive an image.

The method is applicable to any protein layer which repeats regularly in two dimensions, including but not limited to a protein layer in accordance with the first aspect of the present invention.

Thus the protein layer acts as a support for the molecular entities. As the molecular entities are each at a predetermined position in the repeating structure of the protein layer, the molecular entities are supported in a regular array. This provides significant advantages in electron microscopy because it allows imaging of large numbers of the individual molecular entities in known positions. This facilitates various forms of data analysis of the derived image, thereby allowing investigation of the structure of the molecular entity.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will now be described in more detail by way of non-limitative example with reference to the accompanying drawings, in which:

FIG. 1 is a schematic diagram of a protein layer;

FIG. 2 is a schematic diagram of a protein layer which includes a heterologous oligomer assembly;

FIG. 3 is an electron micrograph of a specific protein layer which has been prepared;

FIG. 4 is a schematic diagram of a transmission electron microscope; and

FIG. 5 is a flowchart of a method of performing electron microscopy.

DETAILED DESCRIPTION

Protein layers in accordance with the present invention may be designed by selecting oligomer assemblies which, when fused together with rotational symmetry axes of order 2 aligned with each other, produce a repeating unit which is capable of repeating in two dimensions. As the symmetry of the repeating unit, and hence the protein layer as a whole, depends on the symmetry of the oligomer assemblies, this involves a selection of oligomer assemblies having a quaternary structure which provides appropriate symmetries. This is a straightforward task, because the symmetries of oligomer assemblies are generally available in the scientific literature on proteins, for example from The Protein Data Bank; H. M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov & P. E. Bourne; Nucleic Acids Research, 28 pp. 235-242 (2000) which is the single worldwide archive of structure data of biological macromolecules, also available through websites such as <URL:http://www.rcsb.org>.

In some cases, the repeating unit repeats in the same orientation across the layer. In other cases, two or more adjacent repeating units together form a unit cell which repeats in the same orientation across the layer, but with the repeating units within a unit cell arranged in different orientations.

Examples of oligomer assemblies which produce structures which repeat regularly in two dimensions are given below.

The first oligomer assembly belongs to a dihedral point group of order O, where O equals 3, 4 or 6, and so has a quaternary structure with rotational symmetry axes extending in two dimensions, including a set of O rotational symmetry axes of order 2 which are coplanar, in addition to a single rotational symmetry axis of order O which is perpendicular thereto.
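The geometry of the set of axes described above can be sketched numerically. The following is a minimal illustration only, assuming (this convention is not taken from the text) that the principal axis of order O is placed along z, so that the O axes of order 2 lie in the xy-plane at angular steps of 180/O degrees. The function names are hypothetical.

```python
import math

def twofold_axes(order):
    """Unit vectors of the `order` two-fold axes of a dihedral group of that order.

    Assumption (illustrative convention): the principal axis lies along z,
    so successive two-fold axes are rotated by 180/order degrees in the xy-plane.
    """
    step = math.pi / order
    return [(math.cos(k * step), math.sin(k * step), 0.0) for k in range(order)]

def lie_in_plane(axes, normal=(0.0, 0.0, 1.0), tol=1e-12):
    """True if every axis is perpendicular to `normal`, i.e. the axes are coplanar."""
    return all(abs(sum(a * n for a, n in zip(axis, normal))) < tol for axis in axes)

for order in (3, 4, 6):
    axes = twofold_axes(order)
    # Number of two-fold axes equals O, and all are perpendicular to the principal axis.
    print(order, len(axes), lie_in_plane(axes))
```

For each permitted value of O the sketch returns O axes, all perpendicular to the assumed principal axis, which is the coplanarity relied upon for forming a layer.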

The second oligomer assembly has a quaternary structure with a rotational symmetry axis of the same order 2 as the set of O rotational symmetry axes of said first oligomer assembly. For example, the second oligomer assembly may belong to a dihedral point group of order 2 or to a cyclic point group of order 2. The second oligomer assembly does not have a rotational symmetry axis of order O.

In the assembled first oligomer assembly, inevitably and by definition, there are groups of first monomers arranged symmetrically around each of the set of O rotational symmetry axes of order 2 of the first oligomer assembly. This is because the symmetry results from the identical monomers being so arranged around the rotational symmetry axes.

As a result of the second monomers fused to the first oligomer assembly being arranged symmetrically around one of the set of O rotational symmetry axes of order 2 of the first oligomer assembly, it follows that the second oligomer assembly is held with the group of fused second monomers also held symmetrically around that one of the set of O rotational symmetry axes of order 2 of the first oligomer assembly.

In addition, inevitably and by definition, the second monomers also assemble in the second oligomer assembly in a symmetrical arrangement around the rotational symmetry axis of order 2 of the second oligomer assembly. Thus, the result of the second oligomer assembly having a rotational symmetry axis of the same order 2 as the set of O rotational symmetry axes of the first oligomer assembly is that the first and second oligomer assemblies assemble with their symmetry axes of order 2 aligned with one another. It follows from the symmetry of both oligomer assemblies that this is the most stable arrangement. This results in a 2-fold fusion between the first and second oligomer assemblies. In each of the first and second oligomer assemblies, there are 2 monomers arranged around the rotational symmetry axis, each of the monomers being fused within a respective protomer to a monomer of the other oligomer assembly.

As previously mentioned, the set of rotational symmetry axes does not include all the rotational symmetry axes of the first oligomer assembly. Rather, the set comprises the rotational symmetry axes of the first oligomer assembly which are of the same order 2 as the rotational symmetry axis of the second oligomer assembly.

The particular choice of symmetries of the first and second oligomer assemblies results, on assembly of the protomers into the layer, in the oligomer assemblies being built up with their rotational symmetry axes aligned. Thus, the relative arrangement of the fused oligomer assemblies, and hence the protein layer as a whole, are derived from the arrangements of the rotational symmetry axes of the first oligomer assembly and the second oligomer assembly. In particular, the protein layer has the same symmetry as the set of O rotational symmetry axes of order 2. The symmetry of the protein layer is not dependent on the relative orientation of the monomers within the protomer. In other words, the present invention provides the advantage that the two dimensional repeating pattern of the protein layer may be based solely on the arrangements of the rotational symmetry axes of the oligomer assemblies. This provides advantages in the design of a protein layer by making it easy to select appropriate oligomer assemblies for use in the protein layer. During design, the relative orientation of the monomers within an individual protomer in its unassembled form becomes a much weaker constraint than is present in, for example, WO-00/68248.

There are also advantages during self-assembly of the layer. In particular, the formation of a 2-fold fusion between two given oligomer assemblies results in the bond between the two oligomer assemblies being relatively rigid. This reduces relative motion of the oligomer assemblies during the assembly process and assists in reliable formation of the layer with the oligomer assemblies in the correct relative positions.

The form and production of the protomers will now be described. Although the present invention uses protomers which are different in that they comprise different monomers from WO-00/68248, the form and production of the protomers per se, as well as the polynucleotide encoding the protomers, may be the same as disclosed in WO-00/68248, which is therefore incorporated herein by reference.

The nature of the monomers themselves will now be described.

The monomers are monomers of oligomer assemblies which are capable of self-assembly under suitable conditions to produce a protein layer. The secondary and tertiary structure of the monomers is unimportant in itself providing they assemble into a quaternary structure with the required symmetry. However, it is advantageous if the protein is easily expressed and folded in a heterologous expression system (for example using a plasmid expression vector in E. coli).

The monomers may be naturally occurring proteins, or may be modified by peptide elements being absent from, substituted in, or added to a naturally occurring protein provided that the modifications do not substantially affect the assembly of the monomers into their respective oligomer assembly. Such modifications are in themselves known for a number of different purposes which may be applied to monomers of the present invention. In other words, the monomer may be a homologue and/or fragment and/or fusion protein of a naturally occurring protein.

The monomer may be chemically modified, e.g. post-translationally modified. For example, it may be glycosylated or comprise modified amino acid residues.

Although the monomers may be fused directly together, preferably the monomers are fused by a linking group of peptide or non-peptide elements. In general, linking two proteins by a linking group is known for other purposes and such linking groups may be applied to the present invention.

Another factor in the selection of appropriate oligomer assemblies is the location and orientation of (a) the termini of the first monomers when arranged in the first oligomer assembly in its natural form (i.e. not fused to a second oligomer assembly) and (b) the termini of the second monomers when arranged in the second oligomer assembly in its natural form (i.e. not fused to the first oligomer assembly). Such information on the arrangement of the termini in the oligomer assembly in its natural form is generally available for oligomer assemblies, for example from The Protein Data Bank referred to above. Ideally, these termini should have the same separation and orientation, because they will be fused together in the assembled protein layer to constitute the 2-fold fusion arranged symmetrically around a rotational symmetry axis. That being said, it is not essential for the separation and orientation to be the same, because any difference may be accommodated by deformation of the monomers near the 2-fold fusion and/or by use of a linking group. Therefore, as a general point, oligomer assemblies should be chosen in which the termini of both oligomer assemblies which are to be fused together in a 2-fold fusion allow formation of the fusion without preventing assembly of the oligomer assemblies and hence the protein layer.

Considering the deformation of the monomers near the 2-fold fusion mentioned above, it is desirable to minimise such deformation, which will tend to reduce the reliability of the assembly process. However, if a linking group is fused between the monomers, such deformation may be taken up, at least partially, by the linking group itself. This reduces the deformation of the monomers, thereby increasing the reliability of self-assembly, because the linking group, not being part of the naturally occurring protein, does not take part in the assembly process. This is a particular advantage of the use of a linking group.

Furthermore, the linking group may be specifically designed to be oriented relative to the first and second monomers in the protomer in its normal form, prior to assembly, to reduce such differences in the position and/or orientation of the termini of the first and second monomers. Using information on the position and orientation of the termini of the first and second monomers in the first and second oligomer assemblies in their natural form, which is generally available for oligomer assemblies as discussed above, it is possible to design an appropriate linking group using conventional modelling techniques.

Typically, the monomers are fused at their termini. Alternatively, the monomers may be fused at an alternative location in the polypeptide chain so long as the native fold and symmetry of the naturally occurring oligomer assembly remains the same. For example, one of the monomers may be inserted into a structurally tolerant portion of the other monomer, for example in a loop extending out of the oligomer assembly. Also, truncation of a monomer is feasible and may be estimated by structural examination.

Some examples of symmetries for the oligomer assemblies to produce a protein layer which repeats in two dimensions are as follows.

In these examples, the first oligomer assembly belongs to a dihedral point group of order O, where O equals 3, 4 or 6. Hence the first oligomer assembly has a principal rotational symmetry axis of order O and also O rotational symmetry axes of order 2 which all extend perpendicular to the principal rotational symmetry axis. In order to develop a layer extending in two dimensions, the second oligomer assembly is chosen to have a rotational symmetry axis of order 2 to align with the O rotational symmetry axes of order 2 of the first oligomer assembly with a 2-fold fusion between the first and second oligomer assemblies. Therefore, in this case, the O rotational symmetry axes of order 2 constitute the set of rotational symmetry axes of the first oligomer assembly, i.e. N equals O.

In some classes of protein layer, the protomers are homologous with respect to the monomers, i.e. there is a single type of protomer within the protein layer. In this case, the second oligomer assembly may belong to a dihedral point group of order 2.

For example, Table 1 represents some simple homologous protomers capable of forming a protein layer.

TABLE 1
Homologous Protomers

Protomer    M     N    Layer Symmetry
d3d2        6     2    P622
d4d2        8     2    P422
d6d2        12    2    P622

In Table 1, each protomer is identified by letters which represent the oligomer assemblies to which the respective monomers of the protomer belong. In particular, the letter d represents a dihedral point group and the following number identifies the order of the dihedral point group. In the next two columns of Table 1, there is given the number M of first monomers in the first oligomer assembly and the order N of the set of rotational symmetry axes of the first oligomer assembly, which in this case is 2. The final column gives the symmetry of the resulting protein layer. In each of these cases, the second oligomer assembly belongs to a dihedral point group of order 2.
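The entries of Table 1 can be regenerated with a short sketch, shown here only as an illustration of the naming convention. It relies on the fact that a dihedral point group of order O contains 2 x O symmetry operations, so a homo-oligomer with that symmetry has M = 2 x O monomers. The layer symmetries are copied from Table 1 rather than derived, and the function and variable names are hypothetical.

```python
# Layer symmetries as listed in Table 1; they are looked up here, not derived.
LAYER_SYMMETRY = {3: "P622", 4: "P422", 6: "P622"}

AXIS_ORDER = 2            # N: order of the set of symmetry axes used for fusion
SECOND_ASSEMBLY = "d2"    # second oligomer assembly: dihedral point group of order 2

def table1_row(order):
    """Return (protomer label, M, N, layer symmetry) for a first assembly d<order>."""
    if order not in LAYER_SYMMETRY:
        raise ValueError("O must be 3, 4 or 6")
    label = f"d{order}{SECOND_ASSEMBLY}"
    monomers = 2 * order  # M: number of first monomers in the first oligomer assembly
    return (label, monomers, AXIS_ORDER, LAYER_SYMMETRY[order])

rows = [table1_row(order) for order in (3, 4, 6)]
for row in rows:
    print(*row)
```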

Thus it is easy to visualise the protein layers. In particular, the first oligomer assembly may be visualised as a node from which the set of O rotational symmetry axes of order 2 extend outwardly in a common plane, perpendicular to the principal rotational symmetry axis of order O. The second oligomer assemblies may be visualised as linear links extending from the node aligned with respective ones of the set of O rotational symmetry axes of order 2 of the first oligomer assemblies. In this way, it is easy to visualise the formation of the layer with pores in the spaces between the oligomer assemblies. Thus it will be seen that the symmetry of the layer derives from the symmetrical arrangement of the set of O rotational symmetry axes of order 2 of the first oligomer assemblies.
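The node-and-link picture can be checked for internal consistency with a simple count. The sketch below is illustrative only and rests on stated assumptions: each node is a first oligomer assembly of 2 x O monomers; each 2-fold fusion consumes 2 first monomers; each link is a second oligomer assembly of dihedral order 2 (hence 4 monomers) that joins exactly two nodes. If the picture is consistent, the number of second monomers per node must equal the number of first monomers per node, because each protomer contains one of each. The function name is hypothetical.

```python
from fractions import Fraction

def per_node_counts(order, link_monomers=4, monomers_per_fusion=2):
    """Count first monomers, links and second monomers attributable to one node.

    Assumptions (illustrative): a D2 link has 4 monomers and is shared
    equally between the two nodes it joins.
    """
    first_monomers = 2 * order
    fusions = Fraction(first_monomers, monomers_per_fusion)  # fusion sites on one node
    links = fusions / 2                                      # each link is shared by two nodes
    second_monomers = links * link_monomers
    return first_monomers, links, second_monomers

for order in (3, 4, 6):
    first, links, second = per_node_counts(order)
    print(order, first, links, second)
```

Under these assumptions the first and second monomer counts agree for O equal to 3, 4 and 6, as expected for a layer built from a single type of protomer.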

In one type of protein layer in which the protomers are homologous with respect to the monomers, the second oligomer assembly is a homologous oligomer assembly. In this case the protein layer consists solely of the protomers.

In another type of such a protein layer in which the protomers are homologous with respect to the monomers, the second oligomer assembly is a heterologous oligomer assembly of said second monomers and of third monomers. In this case, the protein layer consists of the protomers and in addition the third monomers assembled with said second monomers into said second oligomer assembly.

Thus, the protomer by itself cannot assemble into the entire protein layer. The second monomers of the heterologous oligomer assembly cannot self-assemble into the entire heterologous oligomer assembly in the absence of the third monomers of that heterologous oligomer assembly. This provides advantages during manufacture of the protein layers, because first oligomer assemblies may be assembled without assembly of an entire protein layer which might otherwise disrupt the production of the protomer. This allows production in a two-stage process.

A particular heterologous oligomer assembly which may be used to advantage as the second oligomer assembly is one comprising monomers which have a binding site capable of binding to biotin or a peptide, and aptamers which are capable of binding to said binding site, preferably non-covalently. The aptamers are used as the second monomer of the protomer. The monomers which have a binding site capable of binding to biotin are a third monomer of the protein layer which is not genetically fused within a protomer. On assembly of the second oligomer assembly, the third monomers assemble to each other and the aptamers assemble into the second oligomer assembly by each binding to a respective third monomer.

This is shown schematically in FIG. 2, which shows an example of a part of the protein lattice 1 including a single second oligomer assembly 4 of this type, the protein lattice otherwise repeating in the same manner as the example shown in FIG. 1. In this example, the first oligomer assembly 3 belongs to a dihedral point group of order 4 and so has a set of four rotational symmetry axes of order 2. Each of the monomers 5 of the first oligomer assembly 3 is fused to a second monomer 6 being an aptamer. The protein lattice 1 further comprises third monomers 7 which are assembled together as part of the second oligomer assembly 4. The second monomers 6 assemble into the second oligomer assembly 4 by each binding to a respective third monomer 7. Thus, in the second oligomer assembly 4, the second monomers 6 are held with the same symmetry as the third monomers 7, but the second monomers 6 are not assembled to each other.

This provides advantages in assisting the formation of the protein lattice. The protein lattice 1 still has a 2-fold fusion between a first oligomer assembly 3 and a second oligomer assembly 4, due to both oligomer assemblies 3 and 4 having a symmetry axis of order 2, as discussed above. However, this is achieved without the second monomers 6 themselves needing to assemble to each other. This assists the assembly of the first oligomer assembly 3, in contrast to the protein lattice 1 shown in FIG. 1 in which both the first and second oligomer assemblies 3 and 4 need to assemble simultaneously. Instead the third monomers 7 assemble and the second monomers 6 each individually assemble to a respective third monomer 7.

The third monomers typically comprise a binding site. Such a binding site may be capable of binding to peptides or non-peptide moieties. In a preferred embodiment the binding site is capable of binding to biotin. In this case, the third monomer may be of any type having such a binding site, for example streptavidin, avidin or Neutravidin.

The terms “streptavidin”, “avidin” or “Neutravidin” as used herein cover variants of these molecules, unless the context requires otherwise. Such variants are typically homologues of the original sequences, i.e. are usually homologues of the sequences shown in SEQ ID NO:4 or SEQ ID NO:5. The variants may be fragments of the original sequences or of homologues of the original sequences. The variant proteins may comprise additional sequences (typically non-streptavidin, non-avidin or non-Neutravidin sequence), and thus be fusion proteins which comprise said original sequences, homologues or fragments.

Preferably the variant sequences retain the structural properties of the original sequences, such as any structural property mentioned herein. Further, the variant sequences generally retain the ability to bind biotin and/or a peptide (such as the peptide of SEQ ID NO:3). In one embodiment the variant sequence is capable of being recognised by an antibody which is capable of recognising the original sequence. The variant sequences will of course retain the property of forming a protein layer as described herein.

The second monomers which are aptamers capable of binding to the binding site may be any of a range of peptide tags, including without limitation streptag I, streptag II, or nanotag. Preferred aptamers are peptides which are 7 to 20 amino acids long, for example 9 to 15 amino acids in length. The aptamer may have homology with SEQ ID NO. 3, having for example at least 6, 7 or 8 amino acids in common with (i.e. the same as) SEQ ID NO. 3.
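The length and identity criteria above lend themselves to a simple screening check. The sketch below is illustrative only. SEQ ID NO. 3 is not reproduced in this section, so the reference peptide used is an arbitrary stand-in, and counting "amino acids in common" as identical residues at the same positions of an ungapped, left-aligned comparison is an assumption about how the count would be made. All names are hypothetical.

```python
# Arbitrary stand-in peptide in one-letter code; NOT the sequence of SEQ ID NO. 3.
REFERENCE_PEPTIDE = "ACDEFGHIK"

def length_ok(peptide, low=7, high=20):
    """True if the peptide length falls in the range stated for preferred aptamers."""
    return low <= len(peptide) <= high

def residues_in_common(candidate, reference):
    """Count identical residues at matching positions (ungapped, left-aligned).

    Assumption: this positional count stands in for 'amino acids in common'.
    """
    return sum(1 for a, b in zip(candidate, reference) if a == b)

def passes_screen(candidate, reference=REFERENCE_PEPTIDE, min_common=6):
    """Apply both the length criterion and the minimum-identity criterion."""
    return length_ok(candidate) and residues_in_common(candidate, reference) >= min_common

# Usage with a hypothetical candidate differing from the stand-in at one position.
print(passes_screen("ACDEFGHLK"))
```

Passing `low=9, high=15` to `length_ok` applies the narrower example range instead.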

In general the first oligomer assembly 3 may be of any type having the required symmetry. One possible example is E. coli ALAD (delta-aminolevulinic acid dehydratase). Other criteria for selection of the first oligomer assembly are set out below.

The aptamer may be fused to a terminus of the first monomer. Where the terminus is used, the first oligomer assembly should preferably possess a terminus lying close to a symmetry axis of order 2 (typically within 15 Å).
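The proximity criterion can be evaluated from structural coordinates as the perpendicular distance from the terminus to the symmetry axis. The following is a minimal sketch, assuming that the terminus position and a point and direction defining the axis are already known in ångströms (for example from a deposited structure). The coordinates in the usage lines are invented for illustration, and the function names are hypothetical.

```python
import math

def distance_to_axis(point, axis_point, axis_direction):
    """Perpendicular distance from `point` to the line through `axis_point`
    running along `axis_direction` (the direction need not be normalised)."""
    norm = math.sqrt(sum(c * c for c in axis_direction))
    unit = [c / norm for c in axis_direction]
    rel = [p - a for p, a in zip(point, axis_point)]
    along = sum(r * u for r, u in zip(rel, unit))         # component along the axis
    perp = [r - along * u for r, u in zip(rel, unit)]     # component perpendicular to it
    return math.sqrt(sum(c * c for c in perp))

def terminus_near_axis(point, axis_point, axis_direction, cutoff=15.0):
    """True if the terminus lies within `cutoff` angstroms of the axis."""
    return distance_to_axis(point, axis_point, axis_direction) <= cutoff

# Invented coordinates: an axis along z through the origin.
print(distance_to_axis((3.0, 4.0, 10.0), (0.0, 0.0, 0.0), (0.0, 0.0, 2.0)))
print(terminus_near_axis((20.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)))
```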

Alternatively, the aptamer may in general be fused at a position other than the terminus provided that the quaternary structure of the first oligomer assembly remains substantially unaffected and provided that the aptamer is one which does not require to be fused to a terminus. For example, Streptag I requires a free C-terminus in order to bind streptavidin. Again it is preferable for the aptamer to be fused at a position within the peptide sequence of the first monomer resulting in the aptamer being located in the assembled oligomer assembly at a position lying close to a symmetry axis of order 2 (typically within 15 Å).

Optionally, there may be a linking group in the protomer between the first monomer and the second monomer which is an aptamer. Typically the linking group might be of length in the range from 1 to 10 amino acids. The linking group might advantageously provide flexibility which assists in the assembly of the lattice. For example, in the case that the first monomer is E. coli ALAD and the second monomer is streptag I, it is preferred to provide a linking group of length one or two amino acids.

Optionally, additional protein fusions may be genetically fused to any free termini of the first monomer or third monomer. This might be done to permit functionalisation of the lattice. Specific non-limitative examples of suitable additional proteins are hexa-histidine tags, specific affinity peptides, ankyrin repeats and calmodulin, each of which has been shown in the literature to be capable of genetic fusion to the N-terminus of E. coli ALAD-streptag I without affecting the ability of this assembly to self-assemble into lattices.

In other classes of protein layer, the protomers are heterologous with respect to the monomers, i.e. there are two or more types of protomer in the protein layer.

To achieve assembly of two types of protomer, the two types of protomer include different monomers of the same heterologous oligomer assembly which may belong to a cyclic point group of order 2. Thus, the first type of protomer comprises a first monomer which is a monomer of said first oligomer assembly belonging to a dihedral point group of order O, where O equals 3, 4, or 6, genetically fused to a second monomer which is a monomer of the second oligomer assembly, which is the heterologous oligomer assembly belonging to a cyclic point group of order 2. Furthermore, the second type of protomer comprises a third monomer which is a monomer of that second oligomer assembly. In the second type of protomer, the third monomer is genetically fused to a fourth monomer which is a monomer of a third oligomer assembly, the third oligomer assembly belonging to a dihedral point group of order 2 or O.

Thus when the protomers of the different types are allowed to assemble, the heterologous oligomer assemblies assemble, thereby linking the protomers of the two types. However, a single type of protomer cannot by itself assemble into the entire protein layer. The individual monomers of the heterologous oligomer assembly cannot self-assemble into the entire heterologous oligomer assembly in the absence of the other, different monomers of that heterologous assembly. This provides advantages during manufacture of the protein layers, because each type of protomer may be separately produced and assembled into a respective, discrete component of the unit cell of the repeating pattern, as a result of the monomers of the homologous first oligomer assembly self-assembling, but without assembly of an entire protein layer. This is an advantage of the heterologous protomers, because assembly of the layer may be avoided until the components are brought together. Otherwise assembly of the layer might hinder the production of the protomers themselves. This allows production in a two-stage process.

In the simplest types of protein layer, the first oligomer assembly of both types of protomer is a monomer of a homologous oligomer assembly belonging to a dihedral point group. Thus the individual types of protomer may. For example, Table 2 represents some simple heterologous protomers capable of forming a protein layer.

TABLE 2

Heterologous Protomers    Components    1st Protomer    2nd Protomer    Layer
                                        M     N         M     N         Symmetry
d₃c_(2A) + d₃c_(2A)*      D₃/D₃         6     2         6     2         P622
d₄c_(2A) + d₄c_(2A)*      D₄/D₄         8     2         8     2         P622
d₆c_(2A) + d₆c_(2A)*      D₆/D₆         12    2         12    2         P622
d₃c_(2A) + d₂c_(2A)*      D₃/D₂         6     2         4     2         P622
d₄c_(2A) + d₂c_(2A)*      D₄/D₂         8     2         4     2         P622
d₆c_(2A) + d₂c_(2A)*      D₆/D₂         12    2         4     2         P622

In Table 2, the first column identifies the two types of protomer. Each protomer is identified by letters which represent the oligomer assemblies to which the respective monomers of the protomer belong. In particular the letter d represents a dihedral point group and the letter c represents a monomer of a heterologous oligomer assembly belonging to a cyclic point group. The subscript number again represents the order of the point group. The subscript capital letters A and A* are used to identify the two different monomers of the same heterologous assembly.

In Table 2, the second column identifies the point groups to which the components resulting from the assembly of each type of protomer belong. A similar notation is used as for the monomers of the protomer, except that capital letters are used to indicate that the point group of the component is being referred to. Thus capital letter D indicates that the component belongs to a dihedral point group and the number gives the order of the point group.

In the next four columns of Table 2, there is given the number M of first monomers in the first oligomer assembly and fourth monomers in the third oligomer assembly, as well as the order N (=2) of the set of O rotational symmetry axes of the first oligomer assembly and the third oligomer assembly. The final column gives the symmetry of the resulting protein layer.
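The values of M in Table 2 follow from the symmetry of the assemblies: a dihedral point group of order O contains 2O symmetry operations, so an assembly having one monomer per asymmetric unit contains M = 2O monomers. The following sketch, in which the one-monomer-per-asymmetric-unit condition and all names are assumptions made for illustration, checks the tabulated values against that relation.

```python
# Each row: (order of dihedral group of 1st component,
#            order of dihedral group of 2nd component,
#            M listed for 1st protomer, M listed for 2nd protomer)
TABLE_2 = [
    (3, 3, 6, 6),
    (4, 4, 8, 8),
    (6, 6, 12, 12),
    (3, 2, 6, 4),
    (4, 2, 8, 4),
    (6, 2, 12, 4),
]


def monomers_in_dihedral_assembly(order: int) -> int:
    """Number of monomers in an assembly with dihedral symmetry of the given
    order, assuming one monomer per asymmetric unit."""
    return 2 * order


def table_is_consistent(rows=TABLE_2) -> bool:
    """Check every listed M against the relation M = 2 x order."""
    return all(
        m_first == monomers_in_dihedral_assembly(order_first)
        and m_second == monomers_in_dihedral_assembly(order_second)
        for order_first, order_second, m_first, m_second in rows
    )
```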

In all the examples of Table 2, the first oligomer assembly of the first type of protomer belongs to a dihedral point group of order O, where O equals 3, 4 or 6.

In the first three examples of Table 2, the first oligomer assembly of the second type of protomer belongs to a dihedral point group of order L, where L equals O. Thus these three examples have spatially the same arrangement as the three examples of the corresponding homologous protomers in Table 1. In the first three examples of Table 2, the first oligomer assemblies of the two types of protomer may be the same oligomer assembly or may be different oligomer assemblies.

In the second three examples of Table 2, the first oligomer assembly of the second type of protomer belongs to a dihedral point group of order L, where L equals 2. These three examples have spatially the same arrangement as the three examples of the corresponding homologous protomers in Table 1, except as follows. Instead of the two dihedral oligomer assemblies of order O being linked by a single cyclic oligomer assembly, the link between the two dihedral oligomer assemblies of order O is extended to be formed by a chain comprising two cyclic oligomer assemblies of order 2 on either side of a dihedral oligomer assembly of order 2. Therefore, it will be seen that the repeating unit of the heterologous oligomer assembly effectively extends the length of the links of the repeating unit between the dihedral oligomer assemblies of order O which may be considered as nodes in the protein layer. Thus, the size of the pores within the protein layer is also increased relative to the use of the corresponding homologous protomers.

The above examples of protein layers are believed to represent the simplest form of protomers capable of forming a protein layer and are preferred for that reason. However, it will be appreciated that other protomers formed from monomers of oligomer assemblies having suitable symmetries will be capable of forming a protein layer. For example, other homologous protomers having larger numbers of monomers than listed in Table 1 will be capable of forming a protein layer. Similarly, other heterologous protomers will be capable of forming a protein layer. These may include two types of protomer having larger numbers of monomers than in the examples of Table 2, or may include more than two types of protomer.

For each of the monomers, there is a large choice of oligomer assemblies having the required symmetry. The present invention is not limited to particular oligomer assemblies, because in principle any oligomer assembly having a quaternary structure with the requisite symmetry may be used. However, as examples Table 3 lists some possible choices of oligomer assemblies of various point groups including those in Tables 1 and 2.

TABLE 3

Example oligomer assemblies

Point Group          Source               Name of Oligomer Assembly               PDB Code
P₃(T, 32)            E. coli              dps                                     1DPS
                     S. epidermis         EpiD                                    1G63
P₄(O, 432)           Human                heavy chain ferritin                    2FHA
                     E. coli              Dihydrolipoamide succinyltransferase    1E2O
                     A. vinelandii        Dihydrolipoamide acetyltransferase      1EAB
D₂                   Human                Mn superoxide dismutase                 1AP5
                     P. falciparum        lactate dehydrogenase                   1CEQ
D₃                   Rat                  6-pyruvoyl tetrahydropterin synthase    1B66
                     E. coli              Amino acid aminotransferase             1I1L
D₄                   E. coli              PurE                                    1QCZ
                     Sipunculid worm      Hemerythrin                             2HMQ
D₆                   S. typhimurium       Glutamine Synthetase                    1F1H
C_(2A) + C_(2A)*     Human                Casein Kinase alpha beta chains         1JWH
C_(3A) + C_(3A)*     Coliphage T4         gp5 + gp27                              1K28
                     HIV                  N36 + C34                               1AIK
                     Pseudomonas putida   Naphthalene 1,2-Dioxygenase             1NDO
C_(4A) + C_(4A)*     Brachiopod           Hemerythrin                             N/A

Thus the present invention provides a protein protomer or plural protein protomers capable of assembly into a protein layer. The monomers of the protomer may be of any length but typically have a length of 5 to 1000 amino acids, preferably at least 20 amino acids and/or preferably at most 500 amino acids.

The invention also provides polynucleotides which encode the protein protomers of the invention. The polynucleotide will typically also comprise an additional sequence beyond the 5′ and/or 3′ ends of the coding sequence. The polynucleotide typically has a length of at least three times the length of the encoded protomer. The polynucleotide may be RNA or DNA, including genomic DNA, synthetic DNA or cDNA. The polynucleotide may be single or double stranded.

The polynucleotides may comprise synthetic or modified nucleotides, such as methylphosphonate and phosphorothioate backbones or the addition of acridine or polylysine chains at the 3′ and/or 5′ ends of the molecule.

Such polynucleotides may be produced and used using standard techniques. For example, the comments made in WO-00/68248 about nucleic acids and their uses apply equally to the polynucleotides of the present invention.

The monomers are typically combined to form protomers by fusion of the respective genes at the genetic level (e.g. by removing the stop codon of the 5′ gene and allowing an in-frame read through to the 3′ gene). In this case the recombinant gene is expressed as a single polypeptide. The genes may, alternatively, be fused at a position other than the terminus so long as the quaternary structure of the oligomer assembly remains substantially unaffected. In particular, one gene may be inserted within a structurally tolerant region of a second gene to produce an in-frame fusion.
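The end-to-end fusion described above can be sketched at the sequence level as follows. This is a minimal illustration only: the function name, the optional linker argument and the toy sequences in the usage note are hypothetical, and a real construct would be designed and verified with standard cloning tools.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(five_prime_gene: str, three_prime_gene: str, linker: str = "") -> str:
    """Join two coding sequences into a single open reading frame by removing
    the stop codon of the 5' gene and appending the 3' gene in frame."""
    five = five_prime_gene.upper()
    three = three_prime_gene.upper()
    link = linker.upper()
    # Every part must be a whole number of codons to keep the reading frame.
    for name, seq in (("5' gene", five), ("3' gene", three), ("linker", link)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} length is not a multiple of three")
    # Remove the terminal stop codon of the 5' gene so translation reads through.
    if five[-3:] in STOP_CODONS:
        five = five[:-3]
    # Guard against an in-frame stop codon remaining upstream of the 3' gene.
    upstream = five + link
    for i in range(0, len(upstream), 3):
        if upstream[i:i + 3] in STOP_CODONS:
            raise ValueError("in-frame stop codon upstream of the 3' gene")
    return upstream + three
```

For example, with toy sequences, `fuse_in_frame("ATGGCTTAA", "ATGTGGTGA")` yields a single reading frame in which only the final stop codon remains. A linking group of one or two amino acids corresponds to a `linker` of three or six nucleotides.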

The invention also provides expression vectors which comprise polynucleotides of the invention and which are capable of expressing a protein protomer of the invention. Such vectors may also comprise appropriate initiators, promoters, enhancers and other elements, such as for example polyadenylation signals which may be necessary, and which are positioned in the correct orientation, in order to allow for protein expression.

Thus the coding sequence in the vector is operably linked to such elements so that they provide for expression of the coding sequence (typically in a cell). The term “operably linked” refers to a juxtaposition wherein the components described are in a relationship permitting them to function in their intended manner.

The vector may be, for example, a plasmid, virus or phage vector. Typically the vector has an origin of replication. The vector may comprise one or more selectable marker genes, for example an ampicillin resistance gene in the case of a bacterial plasmid or a resistance gene for a fungal vector.

Promoters and other expression regulation signals may be selected to be compatible with the host cell for which expression is designed. For example, yeast promoters include S. cerevisiae GAL4 and ADH promoters, S. pombe nmt1 and adh promoters. Mammalian promoters include the metallothionein promoter which can be induced in response to heavy metals such as cadmium. Viral promoters such as the SV40 large T antigen promoter or adenovirus promoters may also be used.

Mammalian promoters, such as β-actin promoters, may be used. Tissue-specific promoters are especially preferred. Viral promoters may also be used, for example the Moloney murine leukaemia virus long terminal repeat (MMLV LTR), the Rous sarcoma virus (RSV) LTR promoter, the SV40 promoter, the human cytomegalovirus (CMV) IE promoter, adenovirus, HSV promoters (such as the HSV IE promoters), or HPV promoters, particularly the HPV upstream regulatory region (URR).

Another method that can be used for the expression of the protein protomers is cell-free expression, for example bacterial, yeast or mammalian.

The invention also includes cells that have been modified to express the protomers of the invention. Such cells include transient, or preferably stable higher eukaryotic cell lines, such as mammalian cells or insect cells, using for example a baculovirus expression system, lower eukaryotic cells, such as yeast or prokaryotic cells such as bacterial cells. Particular examples of cells which may be modified by insertion of vectors encoding for a polypeptide according to the invention include mammalian HEK293T, CHO, HeLa and COS cells. Preferably the cell line selected will be one which is not only stable, but also allows for mature glycosylation of a polypeptide. Expression may be achieved in transformed oocytes.

The protein protomers, polynucleotides, vectors or cells of the invention may be present in a substantially isolated form. They may also be in a substantially purified form, in which case they will generally comprise at least 90%, e.g. at least 95%, 98% or 99%, of the proteins, polynucleotides, cells or dry mass of the preparation.

The protomers may be prepared using the vectors and host cells using standard techniques. For example, the comments made in WO-00/68248 regarding methods of preparing protomers (referred to as “fusion proteins” in WO-00/68248) apply equally to preparation of protomers according to the present invention.

Assembly of the protein layer from the protomers may be performed simply by placing the protomers under suitable conditions for self-assembly of the monomers of the oligomer assemblies. Typically, this will be performed by placing the protomers in solution, preferably an aqueous solution. Typically, the suitable conditions will correspond to those in which the naturally occurring protein self-assembles in nature. Suitable conditions may be those specifically disclosed in WO-00/68248.

In the case of homologous protomers this results in direct assembly of the protein layer.

In the case of heterologous protomers, assembly is preferably performed in plural stages. In a first stage, each type of protomer is separately assembled into a respective discrete component. In a second stage, the discrete components are brought together and assembled into the protein layer. Where plural heterologous protomers are used, there may be further stages intermediate the first and second stage in which the respective discrete components are brought together and assembled into larger, intermediate components.

There will now be described a method by which there has been prepared a specific protein layer which is an example of the type shown in FIG. 2.

The protomers consisted of a first monomer being E. coli ALAD and a second monomer being streptag I. The third monomer was streptavidin.

The protomers were prepared in an E. coli plasmid vector using standard techniques. The E. coli plasmid vector was a derivative of pUC19 having the sequence SEQ ID NO. 1. The sequence of the protomer is:

TABLE 4

(SEQ ID NO. 2)
MTMGSMTDLIQRPRRLRKSPALRAMFEETTLSLNDLVLPIFVEEEIDDYKAVEAPGVMRIPEKHLAREIERIANAGIRSVMTFGISHHTDETGSDAWREDGLVARMSRICKQTVPEMIVMSDTCFCEYTSHGHCGVLCEHGVDNDATLENLGKQAVVAAAAGAXFIAPSAAMDGQVQAIRQALDAAGFKDTAIMSYSTKFASSFYGPFREAAGSALKGDRKSYQMNPMNRREAIRESLLDEAQGANCLMVKPAGAYLDIVRELRERTELPIGAYQVSGEYAMIKFAALAGAIDEEKVVLESLGSIKRAGADLIFSYFALDLAEKKILRRSAWRHPQFGG

The sequence of Streptag I is:

(SEQ ID NO. 3) AWRHPQFGG

The sequence of streptavidin (as used in the work described herein) is:

(SEQ ID NO: 4)
MET GLU ALA GLY ILE THR GLY THR TRP TYR ASN GLN
LEU GLY SER THR PHE ILE VAL THR ALA GLY ALA ASP
GLY ALA LEU THR GLY THR TYR GLU SER ALA VAL GLY
ASN ALA GLU SER ARG TYR VAL LEU THR GLY ARG TYR
ASP SER ALA PRO ALA THR ASP GLY SER GLY THR ALA
LEU GLY TRP THR VAL ALA TRP LYS ASN ASN TYR ARG
ASN ALA HIS SER ALA THR THR TRP SER GLY GLN TYR
VAL GLY GLY ALA GLU ALA ARG ILE ASN THR GLN TRP
LEU LEU THR SER GLY THR THR GLU ALA ASN ALA TRP
LYS SER THR LEU VAL GLY HIS ASP THR PHE THR LYS
VAL LYS PRO SER ALA ALA SER.
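For comparison with the one-letter sequences given elsewhere in this description, a three-letter listing such as the above can be converted with a short script. The following is a sketch only; the function name is hypothetical, and the mapping is the standard code for the twenty common amino acids. An unrecognised code raises an error rather than being guessed.

```python
# Standard three-letter to one-letter codes for the twenty common amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def to_one_letter(three_letter_sequence: str) -> str:
    """Convert a whitespace-separated three-letter sequence to one-letter code."""
    # Treat a trailing full stop as a separator, then split on whitespace.
    codes = three_letter_sequence.replace(".", " ").split()
    letters = []
    for position, code in enumerate(codes, start=1):
        try:
            letters.append(THREE_TO_ONE[code.upper()])
        except KeyError:
            raise ValueError(
                f"unrecognised residue code {code!r} at position {position}"
            ) from None
    return "".join(letters)


# First twelve residues of the listing above.
fragment = to_one_letter("MET GLU ALA GLY ILE THR GLY THR TRP TYR ASN GLN")
```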

For reference the sequence of avidin is:

(SEQ ID NO: 5)
SNEIKESPLH GTQNTINKRT QPTFGFTVNW KFSESTTVFT
GQCFIDRNGK EVLKTMWLLR SSVNDIGDDW KATRVGINIF TRLRTQE.

The gene encoding 5-Aminolaevulinic acid dehydratase (ALAD) was amplified from DH5alpha genomic DNA and inserted into the DsRed-Express-streptagI expression vector described above to replace the DsRed-Express gene cassette.

An ALAD-streptagI protomer was then prepared. 0.1 mM IPTG was included in the expression medium. Induction of expression was as follows: a 10 ml overnight culture of the expression strain (in LB broth containing 30 μg/ml kanamycin) was diluted 1:100 into fresh LB broth containing 30 μg/ml kanamycin. Cells were grown with shaking at 37° C. to a density corresponding to an OD₆₀₀ of 0.6 and were then induced to express the target protein by the addition of IPTG to a final concentration of 1 mM. The culture was maintained at 37° C. with shaking for a further 3 hours before the cells were harvested by centrifugation (5000 g, 10 min, 4° C.). The cell pellet was resuspended in 20 ml of buffer A (300 mM NaCl, 1 mM EDTA, 50 mM HEPES, pH 7.5). Cells were lysed by sonication and the insoluble fraction harvested by centrifugation (25,000 g, 30 min, 4° C.). This fraction was dissolved in 8 M urea and centrifuged (25,000 g, 30 min, 4° C.) to remove insoluble particles. The urea solubilised material was concentrated to 16 mg/ml and passed through a 0.22 μm filter. A drop of this material (1 μl) was then directly injected into a larger drop (5 μl) of buffer A.
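The effect of the final injection step on the concentrations can be estimated by simple dilution arithmetic. The sketch below assumes complete mixing and additive volumes, neither of which is stated in the protocol, and the function name is hypothetical; it is offered only as a rough guide to the conditions in the combined 6 μl drop.

```python
def diluted_concentration(stock_conc: float, stock_volume: float, diluent_volume: float) -> float:
    """Concentration after mixing a stock into a diluent that contains none of
    the solute, assuming ideal mixing and additive volumes."""
    return stock_conc * stock_volume / (stock_volume + diluent_volume)


# 1 ul of urea-solubilised material injected into 5 ul of buffer A.
protein_mg_per_ml = diluted_concentration(16.0, 1.0, 5.0)  # from 16 mg/ml
urea_molar = diluted_concentration(8.0, 1.0, 5.0)          # from 8 M urea
```

Under these assumptions both the protein and the urea are diluted six-fold on injection.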

In general many expression and purification options are available. Another repeatedly successful protocol is as follows:

1. A single colony of BL21 (DE3) Star E. coli was transferred from a Luria-Bertani Agar plate to 500 ml of Luria-Bertani medium containing 75 μg/ml ampicillin and 0.1 mM isopropylthio-beta-D-galactopyranoside (IPTG).
2. This culture was incubated with shaking for 18 hrs at 37° C.
3. The culture was harvested by centrifugation (5,000 g, 5 min) and resuspended in 10 ml of buffer “GF” (150 mM NaCl, 50 mM Tris-HCl, 1 mM EDTA, 0.02% sodium azide, pH 8.0).
4. Cells were lysed using either sonication, freeze thaw, cell lysis reagents (e.g. “Bugbuster”), or lysozyme and DNAse treatment. These are techniques standard in the art.
5. The insoluble fraction was removed by centrifugation (30,000 g, 30 min).
6. The fusion protein was purified from the soluble fraction using Strep-tactin sepharose (IBA GmbH) according to the manufacturer's instructions.
7. Eluted protein was separated from the desthiobiotin contamination that results from the Strep-tactin column by means of size exclusion chromatography using a superose 6 matrix and buffer GF.
8. Purified protein could be stored at 4° C. for at least 6 months.

The purified ALAD-streptag I protomer (~1 mg/ml) was mixed with commercially available core streptavidin in equimolar amounts. Self-assembly commenced immediately and the resultant protein lattices were visualised by means of transmission electron microscopy. FIG. 3 shows a negatively stained transmission electron micrograph of the protein lattice, the unit cell size being 13 nm×13 nm. Image processing of the electron micrographs was performed to enhance the image quality. In particular the electron micrograph was Fourier transformed, filtered using a space group derived filter and averaging, and then reconstructed.

Protein layers in accordance with the present invention have numerous different uses. In general, such uses will take advantage of the regular repeating structure and/or the pores which are present within the structure. Layers in accordance with the present invention may be designed to have pores with dimensions expected to be of the order of nanometres to hundreds of nanometres. Layers may be designed with an appropriate pore size for a desired use.

The highly defined, unusually sized and finely controlled pore sizes of the protein lattices or layers together with the stability of their structures make them ideal for applications requiring microporous materials with pore sizes in the range just mentioned. As one example, the lattices or layers are expected to be useful as a filter element or molecular sieve for filtration or separation processes. In this use, the pore sizes achievable and the ability to design the size of a pore are particularly advantageous.

In another class of use, molecular entities would be attached to the protein layer. Such attachment may be done using conventional techniques. The molecular entities may be any entities of an appropriate size, typically a macromolecular entity, for example proteins, polynucleotides, such as DNA, or non-biological entities. The molecular entities may be a single molecule or a complex of plural molecules. As such, the protein layers are expected to be useful as biological matrices for carrying molecular entities, for example for use in drug delivery, or for crystallizing molecular entities.

Attachment of the molecular entities to the protein layer may be performed in a number of ways.

Some approaches involve “tagging” either or both of the protein protomers (or other component of the layer) or the molecular entities of interest. In this context, tagging is the covalent addition to either or both of the protein protomers (or other component of the layer) or to the target molecular entities, of a structure known as a tag or affinity tag which forms strong interactions with a target structure. Typically, short peptide motifs (e.g. heterodimeric coiled coils such as the “Velcro” acid and base peptides) are used for this purpose. In the case of the protein protomer (or other component of the layer), or a molecular entity which is a protein, this may be achieved by genetically fusing the tag to a component of the protein layer or the molecular entity, that is the expression of a genetically modified version of the protein to carry an additional sequence of peptide elements which constitute the tag, for example at one of its termini, or in a loop region. Alternative methods of adding a tag include covalent modification of a protein after it has been expressed, through techniques such as intein technology.

In one approach, the target structure may be a further tag attached to the other of the protein protomer or target molecular entity, i.e. both a component of the layer and the target molecular entity include complementary affinity tags for attachment to each other.

In another approach, the target structure may be a part of the protein protomer (or other component of the layer) or target molecular entity, i.e. one of a component of the layer and the target molecular entity has an affinity tag which has an affinity to the other of a component of the layer and the target molecular entity. Thus, to attach the molecular entity to the protein layer, a component of the layer may include, at a predetermined position in the protomers, an affinity tag attached to the molecular entity of interest. Alternatively, the molecular entity of interest may have at a predetermined position in the molecular entity, an affinity tag attached to a component of the layer.

When a component of the protein layer is known to form strong interactions with a known peptide sequence, that peptide sequence may be used as a tag to be added to the target molecular entity. Where no such tight binding partner is known, suitable tags may be identified by means of screening. The types of screening possible are phage-display techniques, or redundant chemical library approaches to produce a large number of different short (for example 3-50 amino acid) peptides. The tightest binding peptide elements may be identified using standard techniques, for example amplification and sequencing in the case of phage-displayed libraries or by means of peptide sequencing in the case of redundant libraries.

An alternative approach is for the target molecular entity itself to be expressed as a direct genetic fusion to a component of the layer.

Another alternative approach is to make specific chemical modifications of the lattice in order to provide alternative affinity-based or covalent means of attachment. For example, the site-specific derivatisation of accessible sulphydryl groups in the lattice may be used for the incorporation of nitrilo-triacetic acid (NTA) groups which in turn may be used for binding of metal ions and hence histidine rich target proteins.

To attach the molecular entity to the protein layer using an affinity tag on the layer or the molecular entity, the molecular entity may be allowed to diffuse into, and hence become attached to, a pre-formed protein layer. Annealing of the bound molecular entities into their lowest energy configurations in the protein layer may be performed, for example, using controlled cooling in a liquid nitrogen cryostream. Alternatively, the molecular entities may be mixed with the protomers during formation of the protein layer to assemble with the layer.

In another class of uses, proteins having useful properties could be incorporated as one of the protomers.

A use in which an entity is attached to the protein layer is to perform X-ray crystallography of the molecular entities. In this case, the regular structure of the protein layer allows the molecular entities to be held at a predetermined position relative to a repeating structure, so that they are held in a regular array and in a regular orientation. X-ray crystallography is important in biochemical research and rational drug design.

The protein layer having an array of molecular entities supported thereon may be studied using standard X-ray crystallographic techniques. Use of the protein layer as a support in X-ray crystallography is expected to provide numerous and significant advantages over current technology and protocols for X-ray crystallography, including the following:

(1) Significantly lower amounts of molecule will be required (probably of order micrograms rather than milligrams). This will allow determination of some previously intractable targets.
(2) Use of affinity tags will allow structure determination without the typical requirement for a number of purification steps.
(3) There will be no need to crystallize the molecular entity. This is a difficult and occasionally insurmountable step in traditional X-ray structure determination.
(4) There will be no need to obtain crystalline derivatives for each novel crystal structure to obtain the required phase information. Since the majority of scattering matter will be the known protein layer in each case, determination of the structure may be automated and achieved rapidly by a computer user with little or no crystallographic expertise.
(5) The complexes of a protein with chemicals (substrates/drugs) and with other proteins can be examined without requiring entirely new crystallization conditions.
(6) The process is expected to be extremely rapid and universally applicable, which will provide enormous savings in time and costs.

For use in catalysing biotransformations, enzymes may be attached to the protein layer, or incorporated in the protein layer.

For use in data storage, it may be possible to attach a protein which is optically or electronically active. One example is bacteriorhodopsin, but many other proteins can be used in this capacity. In this case, the protein layer holds the attached protein in a highly ordered array, thereby allowing the array to be addressed. The protein layer might overcome the size limitations of existing matrices for holding proteins for use in data storage.

For use in a display, it may be possible to attach a protein which is photoactive or fluorescent. In this case, the protein layer holds the attached protein in a highly ordered array, thereby allowing the array to be addressed for displaying an image.

For use in charge separation, a protein which is capable of carrying out a charge separation process may be attached to the protein layer, or incorporated in the protein layer. Then the protein may be induced to carry out the separation, for example biochemically by a “fuel” such as ATP or optically in the case of a photoactive centre such as chlorophyll or a photoactive protein such as rhodopsin. A variety of charge separation processes might be performed in this way, for example ion pumping or development of a photo-voltaic charge.

For use as a nanowire, a protein which is capable of electrical conduction may be attached to the protein layer, or incorporated in the protein layer. Using an anisotropic protein layer, it might be possible to provide the capability of carrying current in a particular direction.

For use as a motor, proteins which are capable of induced expansion/contraction may be incorporated into the protein layer.

The protein lattices may be used as a mould. For example, silicon could be diffused or otherwise impregnated into the pores of the protein lattice, thus either partially or completely filling the lattice interstices. The protein material comprising the original lattice may, if required, then be removed, for example, through the use of a hydrolysing solution.

Another use in which an entity is attached to the protein layer is to perform electron microscopy of the molecular entities. This may be performed to determine the structure of the entities. The entities may be of any type including a macromolecule (e.g. a protein or DNA) or a macromolecular complex (e.g. a complex of a macromolecule with one or more other molecular species).

There will first be described known electron microscopy techniques by way of background.

FIG. 4 schematically shows a transmission electron microscope 10 arranged as follows. An electron source 11 produces electrons. An objective lens system 12 directs a beam of electrons from the source 11 onto a sample 13. An imaging lens system 14 directs electrons transmitted through the sample 13 onto a sensor 15 which produces an image. The image may be a focussed image or may be a diffraction pattern, the latter being useful where the entity is presented in a regular array (e.g. tubes of molecules, 2D crystals, or helical arrays). Information from multiple images, corresponding to multiple different views of the molecular species, may be subsequently combined to produce a 3D reconstruction.

Sample preparation and presentation within the microscope is performed as follows. In practice, samples 13 are presented to the electron beam within the sample holder of an electron microscope. Samples 13 are generally mounted on a copper grid. This may have been coated with a thin layer of deposited carbon that may in turn be either continuous across the holes of the grid, or may be deliberately incomplete so as to leave holes in which the sample floats (a “lacey” carbon layer).

Details of the sample mounting protocol depend on whether or not the sample is to be visualised under cryo-conditions.

For cryo conditions, the sample 13 may be introduced into a medium that is augmented with a cryoprotectant agent so as to minimise the tendency to form ice at low temperatures. Examples of cryoprotectant agents include glucose and trehalose. In addition, a contrast-enhancing constituent may be added to the sample 13 environment. An example of a contrast enhancing agent is tannin. After cryoprotection, the sample 13 is introduced onto the (possibly coated) copper grid, excess sample and embedding medium are withdrawn by blotting so as to produce a sample 13 no thicker than 1000 Å, and the grid is introduced into an environment at cryo-temperatures (<200 K). The speed of cooling is an important factor in avoiding the formation of ice and consequent sample damage during freezing. Rapid cooling may be achieved by plunging the sample 13 into liquid nitrogen, into a stream of gaseous nitrogen at temperatures below 120 K, or into a bath of a less volatile liquid (such as propane) at cryo temperatures. Mechanical stages may be used to ensure a rapid and reproducible introduction of the copper grid into the freezing environment.

Where samples 13 are not to be presented in vitreous frozen solution (i.e. under non-cryo conditions), a solution of the substance to be imaged is introduced onto a carbon-coated copper grid, a period of time is left for the sample to adsorb to the carbon layer, and then excess sample and solution are withdrawn by blotting. To enhance the contrast of images, and to minimise the deleterious consequences of radiation damage, the sample 13 may then be stained. Since biological samples demonstrate intrinsically low scattering, the stains used are generally themselves electron dense, and hence strongly scattering. Thus the stains used are generally “negative stains”: the images recorded are dark where the stain is, and are lighter in regions from which stain is excluded by the presence of the sample. Uranyl acetate is an example of a negative stain.

Data collection is performed as follows.

In the case of deriving a focussed image, images are in fact recorded away from perfect focus. While this is done to generate contrast in the image, it results in a degradation of the image. Specifically, Fourier terms calculated from the image are modulated by a “Contrast Transfer Function” (CTF), which modulates the amplitudes of Fourier terms in a manner that is a function of the corresponding scattering angle. Corrected Fourier terms can generally be recovered by appropriate scaling once the extent of defocus and astigmatism have been characterised. At a given defocus, the CTF will adopt a value of zero for Fourier terms corresponding to particular scattering angles. These terms cannot, therefore, be recovered by post processing. To fill in the corresponding holes in reciprocal space, images are recorded at a range of defocuses, so that Fourier terms that are modulated to (or close to) zero in an image recorded at one defocus will have a measurable amplitude at another defocus.
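
The benefit of recording at a range of defocuses can be illustrated by a minimal numerical sketch. It assumes a deliberately simplified CTF of the form sin(π λ Δz k²), in which spherical aberration, astigmatism and envelope terms are neglected. The function names, the electron wavelength (about 0.0251 Å, roughly corresponding to 200 kV) and the two defocus values are illustrative assumptions and are not taken from the procedure described above.

```python
import math

# Assumed electron wavelength in angstroms (roughly 200 kV); illustrative only.
WAVELENGTH = 0.0251

def ctf(k, defocus, wavelength=WAVELENGTH):
    """Simplified CTF at spatial frequency k (1/angstrom), defocus in angstroms.

    Only the defocus term is modelled; spherical aberration, astigmatism
    and envelope functions are ignored in this sketch.
    """
    return math.sin(math.pi * wavelength * defocus * k * k)

def first_zero(defocus, wavelength=WAVELENGTH):
    """Lowest non-zero spatial frequency at which the simplified CTF vanishes."""
    return 1.0 / math.sqrt(wavelength * defocus)

# A Fourier term that is lost at 1.0 micrometre defocus ...
k0 = first_zero(10000.0)
lost_at_first_defocus = abs(ctf(k0, 10000.0)) < 1e-9
# ... is transferred with nearly full amplitude at 1.5 micrometre defocus.
kept_at_second_defocus = abs(ctf(k0, 15000.0)) > 0.99
```

Under these assumptions, combining images recorded at the two defocus values would supply a measurable amplitude for the term that either image alone might lose.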

Inelastic interaction of electrons with the sample results in deposition of energy that, in turn, causes damage to the sample. This damage degrades the structure of the molecules within the sample. For this reason, images and diffraction patterns are recorded using a relatively low dose of electrons. This experimental limitation means that there is a relatively poor signal to noise ratio in the recorded images of each molecular species captured within the field of view of an image. This translates to each image carrying relatively low resolution information about the structure of the sample. In general, enhancement of the signal to noise is achieved by effectively averaging the images of multiple molecules that are observed in the same (or similar) orientations with respect to the electron beam.
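
The gain obtained by averaging can be sketched numerically. The example below is an illustration under stated assumptions rather than part of the described method: the "images" are one-dimensional lists of samples, the noise is independent and Gaussian, and all function names are hypothetical. For independent noise of this kind, the residual error is expected to fall roughly as the inverse square root of the number of images averaged.

```python
import math
import random

def noisy_copy(signal, sigma, rng):
    """Return the signal with independent Gaussian noise added to every sample."""
    return [s + rng.gauss(0.0, sigma) for s in signal]

def average(images):
    """Sample-wise mean of equally sized images (flat lists of samples)."""
    n = len(images)
    return [sum(samples) / n for samples in zip(*images)]

def rms_error(estimate, truth):
    """Root-mean-square difference between an estimate and the true signal."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth))

def averaged_error(n_images, sigma=1.0, seed=0):
    """RMS error remaining after averaging n_images noisy copies of a test signal."""
    rng = random.Random(seed)
    truth = [math.sin(2.0 * math.pi * i / 32.0) for i in range(256)]
    copies = [noisy_copy(truth, sigma, rng) for _ in range(n_images)]
    return rms_error(average(copies), truth)
```

Comparing `averaged_error(1)` with `averaged_error(100)` gives a feel for how many copies of a molecule in a common orientation might be needed to reach a given noise level.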

Each image approximates to a projection of the electron density (or more precisely the potential) distribution of the molecular species. Hence, a single image of a single molecule does not contain sufficient information to infer the 3D structure of that molecule. Therefore, images have to be recorded from the sample in multiple orientations with respect to the beam.

For periodic structures, Fourier components can be measured directly by recording the diffraction pattern, rather than an image of the sample. This approach avoids the complication of modulation by the CTF although other characteristics of the experiment and of the instrument must still be corrected for in post-processing. For periodic samples (e.g. 2D crystals or helical arrays), scattering becomes concentrated into discrete directions that are characteristic of the size and shape of the repeated unit (i.e. the unit cell), giving rise to diffraction spots in the scattering pattern, rather than a continuous scattering function. This process of “Bragg amplification” makes for readily recordable signals. A further advantage of recording the diffraction pattern is that the intensities of the scattered pattern (i.e. that property which is recorded) are independent of global motions of the sample during the exposure. Such motions can be caused by thermal fluctuation as well as specific heating and charging of the sample caused by the electron beam. A disadvantage of recording the scattered pattern rather than focussed electrons (i.e. a diffraction pattern rather than an image) is that recording of the scattered pattern loses phase information for the Fourier terms. At the same time, local imperfections in a lattice can be corrected if an image thereof is collected, but not (trivially) if a diffraction pattern is collected.
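
Both properties noted above (intensities that are insensitive to a global motion of the sample, and the loss of phase information) follow from the Fourier shift theorem and can be checked with a small one-dimensional sketch. The discrete, circularly shifted "density" used here is a simplifying assumption, and the function names are hypothetical.

```python
import cmath

def dft(samples):
    """Discrete Fourier transform of a 1D sequence (direct O(n^2) evaluation)."""
    n = len(samples)
    return [
        sum(samples[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]

def intensities(samples):
    """Squared moduli of the Fourier terms: the quantity a diffraction pattern records."""
    return [abs(term) ** 2 for term in dft(samples)]

def translate(samples, shift):
    """Circularly shift the sequence, standing in for a global motion of the sample."""
    shift %= len(samples)
    return samples[-shift:] + samples[:-shift]

density = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 2.0, 0.0]
moved = translate(density, 3)

# The recorded intensities are unchanged by the translation ...
same_intensity = all(
    abs(a - b) < 1e-9 for a, b in zip(intensities(density), intensities(moved))
)
# ... whereas the phase of a Fourier term, which is not recorded, does change.
phase_changed = abs(cmath.phase(dft(density)[1]) - cmath.phase(dft(moved)[1])) > 1e-6
```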

In the case of electron tomography, a single example of the species to be visualised is imaged with extremely low dose at a range of orientations. Hence a single molecular species is imaged. This addresses a potential criticism of other approaches: each representative molecule of a sample might be subtly different, which makes both the averaging of multiple images and 3D reconstruction inappropriate. It has the disadvantage that the electron dose that can be tolerated by a single species is spread over imaging in multiple orientations: this ultimately limits the resolution of 3D reconstruction that can be achieved.

Data analysis is performed as follows.

The protocol used to analyse data from electron microscopy depends primarily on whether the sample is periodic (i.e. 2D crystalline or presented in a helical array), or aperiodic, i.e. presented as isolated particles which may or may not have local rotational symmetry, but which lack significant translational symmetry. In both cases, where image (rather than diffraction) data have been collected, the defocus and astigmatism of the sample are identified by analysis of the intensity distribution of Fourier transformed regions of the image. Based on these values, which may vary across the image, an appropriate correction can be calculated to compensate for CTF effects.

One type of data analysis is single particle reconstruction. This allows reconstruction of a three-dimensional (3D) image from images of individual entities, as follows.

For non-periodic samples, images (rather than diffraction patterns) are recorded. Analysis begins with locating samples on the recorded image. For unstained biological molecules this presents a significant problem: the inherently low signal-to-noise ratio means that molecules may not be apparent against background. Even if they are visible, such molecules may be so poorly imaged as to preclude the characterisation of their orientation compared to other images of the same molecular species. This problem is made worse where the species to be visualised is small. In practice, it is not readily possible to apply conventional EM to non-crystalline samples of macromolecules (or macromolecular complexes) with a combined molecular weight less than ~125 kDa.

After locating multiple molecular species to assemble a “dataset” of (noisy) images, the next stage is classification. In this step, images of the molecular species are grouped, so that those that represent similar views are associated with each other. Particularly where a carbon support has been used, there may be a limited set of such views present in the dataset. Images of particles that fall within such clusters are averaged to provide “class averages”. The relative orientations of a set of class averages are determined by means of a “common lines” or similar approach. Ultimately, this allows the information from multiple different views to be assembled in reciprocal space so as to permit 3D reconstruction.

Another type of data analysis is two-dimensional (2D) crystallographic analysis. This is applicable to periodic samples. Data may have been collected as images or as diffraction patterns.

In images of a crystalline lattice, recognition of the geometry and location of the lattice provides a readily exploited means of predicting the location of the multiple copies of the species to be imaged. Averaging can be performed either in real space (where individual unit cell images are summed) or in reciprocal space. In the latter approach, the image is Fourier transformed to produce a set of diffraction spots that result from scattering by that part of the image which has a periodic character, i.e. by the ordered array of molecules. The rest of the scattering (i.e. that intensity which does not fall at the position of diffraction spots) comes from background and noise. The multiple unit cells in the field of view can therefore be averaged by setting all off-peak intensities to zero and carrying out a further Fourier transformation. This process is called Fourier filtering. Both real space averaged and Fourier filtered images can be enhanced by a process of “unbending”. In this process, local distortions of the lattice can be identified (generally by an autocorrelation method), and used to correct the image to generate a picture that would prevail if the lattice were not subject to any local distortion.
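
Fourier filtering as outlined above can be sketched in one dimension. The example assumes a perfectly regular lattice whose period divides the number of samples exactly, so that the diffraction spots fall on known indices of the transform; real images would additionally require lattice indexing and the unbending step. The function names are hypothetical.

```python
import cmath

def dft(values, inverse=False):
    """Direct discrete Fourier transform; the inverse is normalised by 1/n."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [
        sum(values[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]
    return [term / n for term in out] if inverse else out

def fourier_filter(row, period):
    """Keep only Fourier terms at the diffraction spot positions, then transform back.

    Assumes the lattice period (in samples) divides len(row) exactly.
    """
    n = len(row)
    if n % period != 0:
        raise ValueError("period must divide the number of samples")
    spot_spacing = n // period  # spots fall at multiples of this index
    spectrum = dft(row)
    masked = [
        term if k % spot_spacing == 0 else 0j  # off-peak terms set to zero
        for k, term in enumerate(spectrum)
    ]
    return [term.real for term in dft(masked, inverse=True)]
```

In this idealised setting, discarding the off-peak terms is equivalent to averaging all unit cells in the row, which is why the filtered result lies closer to the noise-free motif than the recorded row does.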

Diffraction patterns of the crystalline lattice can be used to measure directly the amplitudes of Fourier components. Phases for these terms can be established only using methods analogous to those used in protein crystallography. These include isomorphous replacement (IR), molecular replacement (MR), and density modification (DM). For IR, diffraction has to be measured before and after the addition (or substitution) of a part of the structure. For MR, a known structure or electron density distribution can be used to calculate phases for the unknown structure. For DM, phases for low resolution terms must be available (e.g. from analysis of images as above), and phasing of high resolution terms is achieved by iterative imposition of averaging and solvent flattening, including increasingly high resolution terms into the process as phase is extended.

The disposition of the crystal with respect to the beam can be inferred from the apparent geometry of diffraction spots, which may either be recorded directly or calculated by Fourier transformation of an image, provided that the geometry of the repeating unit in the crystal (the unit cell geometry) is known. Where the structure of a significant part of the lattice is known, a calculated image of this part of the lattice can also be used to assess the orientation of the lattice in an experimentally recorded image. Thus information from multiple images in multiple orientations, collected at multiple tilt angles, can readily be combined to carry out 3D image reconstruction.

The application to imaging of molecular entities supported on a protein lattice will now be described. Benefits are achieved because the entities are each supported at a predetermined position in the repeating structure of the protein layer.

There may be used a conventional transmission electron microscope 10, for example as shown in FIG. 4. Imaging is performed using the method shown in FIG. 5.

First, in step S1, there is prepared a protein lattice having the molecular entities attached thereto. This is done using the techniques described above. A sample 13 for the transmission electron microscope 10 is prepared with the protein lattice using standard procedures, as discussed above.

Two approaches for attaching the entity to the protein layer are as follows.

In the first approach, the entity is added to a solution (or suspension) containing the protein layer. Thus the entities attach to the layer in solution. The resultant layer is then subjected to sample preparation as described above for either cryo electron microscopy or for non-cryo electron microscopy, either with or without staining.

In the second approach, the protein layer is first deposited onto the carbon layer of a coated copper grid to form the sample holder of the electron microscope 10. The entity is introduced subsequently. In this case, a suspension of the protein lattice is placed on the carbon-coated grid, adsorption is allowed to occur, excess crysalin and surrounding solution are removed, and a solution of the target species is introduced. After an incubation in which binding of the target to the crysalin occurs, excess target and surrounding solution are removed. Subsequent sample preparation is as described above for either cryo electron microscopy or for non-cryo electron microscopy.

For optimal resolution in the structure of the molecular entity, it is preferable for the molecular entities to be aligned with identical orientations with respect to every axis. In step S2, which is optional, the molecular entities are aligned with respect to the protein lattice.

Two possible methods of molecular alignment, which may be implemented either independently or in combination, are as follows.

A first alignment method is to apply an electric field with a vector parallel to the principal symmetry axis of the “first” protein layer component in order to align the molecular entities by virtue of their intrinsic or induced dipoles.

A second alignment method takes advantage of polar and/or hydrophobic interactions between molecular entities and the protein layer through a process of thermal annealing during which the target molecules are slowly cooled to identical minimum energy conformations.

In step S3, imaging is performed to derive an image. Such data collection is conducted using standard protocols, for example as described above for conventional EM. By way of example, images may be collected at a series of defocus steps and also employing the tilt-stage of the microscope to image the lattice through a range of angles. Where orientation of the target molecules has been successful, a series of electron diffraction images may also be usefully collected.

In step S4, data analysis of the images is performed. A variety of data analysis techniques may be applied, as follows.

Where it has been possible to impose an approximately common orientation of each bound target molecule with respect to the underlying lattice, a 2D crystallographic data analysis may be performed, as described above. This allows a 3D reconstruction of the target molecule to be derived.

Single particle image reconstruction tools can also theoretically be applied to image reconstruction of 2D periodic arrays, and where this provides improved image reconstruction, that approach is also taken to image protein layers and attached molecular entities. Hybrid methods, whereby some computational techniques of 2D crystallography are combined with computational techniques of single particle image analysis, are also used where this is suitable.

Where it has not been possible to impose an approximately common orientation of each bound target molecule, a combination of the methods outlined above for single particle 3D reconstruction and 2D crystallography is applied. In this combination, the components of the protein lattice itself are identified and subtracted from the image.

The components of the protein lattice may be derived as described above from an analysis of one or more recorded images of a protein layer and attached molecular entities. Alternatively, the components of the protein lattice may be derived from a reference image acquired separately or being a stored image acquired previously.

This allows the lattice components of each image to be identified and removed. The resulting difference image is an image of the entities in isolation that would have been recorded if the entities were disposed in space at positions having the same repeating pattern as the structure of the protein layer, albeit in a partially random orientation.
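
The subtraction step can be sketched as a pixel-wise difference. This is a minimal illustration only: it assumes that the recorded image and the lattice reference have already been brought into register and onto a common intensity scale, which in practice would be the difficult part, and the function name is hypothetical.

```python
def subtract_lattice(image, lattice_reference):
    """Pixel-wise difference between a recorded image and a lattice-only reference.

    Both arguments are 2D images given as lists of equal-length rows, assumed
    to be already aligned and on the same intensity scale.
    """
    if len(image) != len(lattice_reference) or any(
        len(a) != len(b) for a, b in zip(image, lattice_reference)
    ):
        raise ValueError("image and reference must have the same shape")
    return [
        [pixel - ref for pixel, ref in zip(img_row, ref_row)]
        for img_row, ref_row in zip(image, lattice_reference)
    ]
```

The returned difference image would then be passed to single particle processing, with the known lattice positions indicating where each entity is expected to lie.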

Thereafter single particle reconstruction is performed, as described above. This process is expedited by the fact that the molecular entities will be found at readily predicted positions on the image, as a consequence of their binding to known locations on the protein layer, the location and orientation of which are readily identified. The subtraction of the reference image effectively accomplishes the first step of single particle 3D reconstruction (particle picking) as described above. Similarly, a degree of alignment of the molecules is likely to apply and contributes to particle classification.

Variants

Homologues of protein sequences are referred to herein. Such homologues typically have at least 70% homology, preferably at least 80%, 90%, 95%, 97% or 99% homology, for example over a region of at least 15, 20, 30, 100 or more contiguous amino acids. The homology may be calculated on the basis of amino acid identity (sometimes referred to as “hard homology”).
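
Homology on the basis of amino acid identity can be sketched as follows. The example assumes that the two sequences have already been aligned to equal length by some other tool, and it adopts one possible convention (identical columns divided by alignment length, with gap columns counted as non-identical); other conventions exist, and the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned to equal length; '-' marks a gap,
    and a column containing a gap is counted as a non-identity.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)

# One substitution in ten aligned residues (illustrative sequences only).
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
meets_70_percent = identity >= 70.0
```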

For example the UWGCG Package provides the BESTFIT program which can be used to calculate homology (for example used on its default settings) (Devereux et al (1984) Nucleic Acids Research 12, p 387-395). The PILEUP and BLAST algorithms can be used to calculate homology or line up sequences (such as identifying equivalent or corresponding sequences (typically on their default settings)), for example as described in Altschul S. F. (1993) J Mol Evol 36:290-300; Altschul, S. F. et al (1990) J Mol Biol 215:403-10.

Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (URL: http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence that either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighbourhood word score threshold (Altschul et al, supra). These initial neighbourhood word hits act as seeds for initiating searches to find HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Extensions for the word hits in each direction are halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T and X determine the sensitivity and speed of the alignment. The BLAST program uses as defaults a word length (W) of 11, the BLOSUM62 scoring matrix (see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89: 10915-10919), alignments (B) of 50, expectation (E) of 10, M=5, N=4, and a comparison of both strands.
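
The three halting conditions for extending a word hit can be illustrated by the following sketch of an ungapped extension in one direction. It is not the BLAST implementation: a simple match/mismatch score stands in for a scoring matrix, and the match reward of 5 and mismatch penalty of 4 are one reading of the M and N defaults quoted above, adopted here as an assumption. The function and parameter names are hypothetical.

```python
def extend_right(query, subject, q_pos, s_pos, seed_score, x_drop,
                 match=5, mismatch=-4):
    """Ungapped extension of a word hit to the right of q_pos and s_pos.

    Returns (gain, length): the best score gained over the seed and the
    number of residues added to reach it.
    """
    score = best = best_len = 0
    steps = 0
    # Halting condition 3: the end of either sequence is reached.
    while q_pos + steps < len(query) and s_pos + steps < len(subject):
        same = query[q_pos + steps] == subject[s_pos + steps]
        score += match if same else mismatch
        steps += 1
        if score > best:
            best, best_len = score, steps
        elif best - score >= x_drop or seed_score + score <= 0:
            # Halting condition 1: score fell off by X from its maximum.
            # Halting condition 2: cumulative score went to zero or below.
            break
    return best, best_len
```

For instance, with a four-residue seed scoring 20 and an X of 10, `extend_right("ACGTACGTTTTT", "ACGTACGTAAAA", 4, 4, 20, 10)` extends across the four further matching residues and stops once three successive mismatches have pulled the score 12 below its maximum.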

The BLAST algorithm performs a statistical analysis of the similarity between two sequences; see e.g., Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90: 5873-5877. One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two amino acid sequences would occur by chance. For example, a sequence is considered similar to another sequence if the smallest sum probability in comparison of the first sequence to the second sequence is less than about 1, preferably less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.

The homologous sequences typically differ by at least 2, 5, 10, 20 or more mutations (which may be substitutions, deletions or insertions of amino acids). The homologous sequences typically differ by at most 5, 10, 20 or more mutations (which may be substitutions, deletions or insertions of amino acids). Typically, up to 40% of the amino acids of the sequence are mutated. These mutations may be measured across any of the regions mentioned above in relation to calculating homology. The substitutions are preferably conservative substitutions. These are defined according to the following Table. Amino acids in the same block in the second column and preferably in the same line in the third column may be substituted for each other:

ALIPHATIC    Non-polar            G A P
                                  I L V
             Polar - uncharged    C S T M
                                  N Q
             Polar - charged      D E
                                  K R
AROMATIC                          H F W Y
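
A lookup based on the Table can be sketched as below. The sketch tests only the broader criterion (membership of the same block in the second column) and does not attempt the preferred, narrower criterion of sharing a line in the third column. The dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
# Blocks of the second column of the Table, keyed by illustrative labels.
SUBSTITUTION_BLOCKS = {
    "aliphatic non-polar": set("GAPILV"),
    "aliphatic polar uncharged": set("CSTMNQ"),
    "aliphatic polar charged": set("DEKR"),
    "aromatic": set("HFWY"),
}

def is_conservative(residue_a, residue_b):
    """True if both one-letter residues fall within the same block of the Table."""
    return any(
        residue_a in block and residue_b in block
        for block in SUBSTITUTION_BLOCKS.values()
    )

# Isoleucine to valine stays within the non-polar aliphatic block,
# whereas aspartate to tryptophan crosses from a charged block to the aromatic block.
ile_to_val = is_conservative("I", "V")
asp_to_trp = is_conservative("D", "W")
```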

What is claimed is:
1. A method of performing electron microscopy of a molecular entity, the method comprising: providing a protein layer having a structure which repeats regularly in two dimensions, the protein layer comprising protein protomers which each comprise at least two monomers genetically fused together, the monomers each being monomers of a respective oligomer assembly, the protomers comprising: a first monomer which is a monomer of a first oligomer assembly belonging to a dihedral point group of order O, where O equals 3, 4 or 6, and having a set of O rotational symmetry axes of order 2 extending in two dimensions; and a second monomer genetically fused to said first monomer which second monomer is a monomer of a second oligomer assembly having a rotational symmetry axis of order 2, the first monomers of the protomers are assembled into said first oligomer assemblies and the second monomers of the protomers are assembled into said second oligomer assemblies, said rotational symmetry axis of said second oligomer assemblies of order 2 being aligned with one of said set of rotational symmetry axes of order 2 of one of said first oligomer assemblies with two protomers being arranged symmetrically therearound, the protein layer supporting molecular entities each attached to the protein layer at a predetermined position in the repeating structure of the protein layer; and performing electron microscopy of the protein layer having the molecular entities supported thereon to derive an image.
2. A method according to claim 1, wherein said step of providing a protein layer which supports molecular entities comprises making the protein layer and subsequently attaching the molecular entities thereto.
3. A method according to claim 2, wherein the step of attaching the molecular entities to the protein layer is performed in solution.
4. A method according to claim 1, further comprising, prior to the step of performing electron microscopy, aligning the molecular entities with respect to the protein layer.
5. A method according to claim 4, wherein the step of aligning the molecular entities with respect to the protein lattice comprises applying an electric field to the protein layer.
6. A method according to claim 5, wherein the step of aligning the molecular entities with respect to the protein layer comprises cooling the protein lattice to a minimum energy state.
7. A method according to claim 1, further comprising performing data analysis of the image.
8. A method according to claim 7, wherein the data analysis is a two-dimensional crystallographic data analysis.
9. A method according to claim 7, wherein the data analysis comprises identifying the image of the components of the protein layer and subtracting the image of the components of the protein layer from the image derived in said step of performing electron microscopy to derive an image of the molecular entities, and performing a single particle reconstruction of the image of the molecular entities.
10. The method of claim 1, wherein the rotational symmetry of the second oligomer assembly belongs to a dihedral point group of order 2 or to a cyclic point group of order 2.
11. The method of claim 1, wherein the protomers are homologous with respect to the monomers.
12. The method of claim 11, wherein the rotational symmetry of the second oligomer assembly belongs to a dihedral point group of order 2.
13. The method of claim 11, wherein the second oligomer assembly is a heterologous oligomer assembly of said second monomers and of third monomers, said protein layer further comprising said third monomers assembled with said second monomers into said second oligomer assembly.
14. The method of claim 13, wherein the third monomers are monomers which have a binding site capable of binding to biotin or a peptide, and said second monomers are aptamers which are capable of binding to said binding site.
15. The method of claim 14, wherein said third monomers are streptavidin.
16. The method of claim 14, wherein said second monomers are Streptag I (SEQ ID NO. 3).
17. The method of claim 1, wherein the protomers are heterologous with respect to the monomers.
18. The method of claim 17, wherein the protein layer comprises protein protomers of two types, the first type of protomer comprising a first monomer which is a monomer of said first oligomer assembly belonging to a dihedral point group of order O, where O equals 3, 4, or 6, genetically fused to a second monomer which is a monomer of said second oligomer assembly, said second oligomer assembly being a heterologous oligomer assembly belonging to a cyclic point group of order 2, and the second type of protomer comprising a third monomer which is a monomer of said second oligomer assembly, genetically fused to a fourth monomer which is a monomer of a third oligomer assembly, said third oligomer assembly belonging to a dihedral point group of order 2 or O.
19. The method of claim 18, wherein said oligomer assembly belongs to a dihedral point group of order O, said third oligomer assembly being the same as said first oligomer assembly.
20. The method of claim 1, wherein each of said monomers of said respective oligomer assemblies either is a naturally occurring protein or is based on a naturally occurring protein with peptide elements being absent from, substituted in, or added to the naturally occurring protein without substantially affecting assembly of monomers of said respective oligomer assembly.
21. The method of claim 1, wherein, in said protomers, said monomers are genetically fused via a linking group.
22. The method of claim 21, wherein the linking group is oriented relative to the first and second monomers in the protomer in its normal form prior to assembly to reduce any difference in the assembled layer in either or both of the position and orientation of (a) the termini of said first monomers in their arrangement in said first oligomer assembly in its natural form symmetrically around said one of said set of rotational symmetry axes of order N of said first oligomer assembly, and (b) the termini of said second monomers in their arrangement in said second oligomer assembly in its natural form symmetrically around said rotational symmetry axis of order N of said second oligomer assembly.
23. The method of claim 1, wherein a component of the protein layer has an affinity tag, the molecular entities being attached to respective affinity tags.
24. The method of claim 1, wherein the molecular entity comprises a protein having a peptide affinity tag attached to a component of the protein layer.
25. The method of claim 1, wherein the molecular entity comprises a protein, and both of a component of the protein layer and the molecular entity have respective affinity tags attached to each other.
26. The method of claim 1, wherein the molecular entities are genetically fused within a component of the protein layer.